ABSTRACT

Noting that composition teachers would like computers to facilitate
the mechanics of writing, analyze text, and correct problems, this
paper argues that classroom computer applications are limited because
computers cannot analyze text the way a reader would. The paper first
posits that readers look at text in terms of semantics and content
while computers look at text in terms of syntax and form. It then
illustrates this point by comparing spelling check programs with text
analysis programs. It describes three analysis programs (DICTION and
STYLE, by Bell Labs, and the more powerful EPISTLE, by IBM) and how
they function, and provides output from the first two programs, run
on the paper itself, to illustrate that the analysis is more or less
useless to the writer. The paper then discusses the possibility that
text analyzers can be made better, observing that the writer must be
skilled in sorting through inaccurate output in order for the output
to be of any use, and that those without such skills may be damaged
as much as helped by such programs. The paper concludes with the
observation that programs can be made more accurate, but that a more
worthy goal would be better on-line editors, better systems for
communication, and better word processors. (HTH)
Limitations on the Use of Computers in Composition

As I understand the recent literature on computers and
composition, the basic question addressed is the following:

1. What applications do computers have in the composition
classroom?



The literature, indeed, has agreed on an answer: Today, word
processing, communications, invention, text analysis, and idea
processing; tomorrow, improvements on those things and maybe a lot
more. I am not happy with this answer. The list of current
applications is confused, for the useful applications and the
useless are lumped together indiscriminately. (Some of these
applications don't work and can't work; others work, but not in any
way that is useful; still others work and are terrific.) The
improvements and new applications being described (see some other
papers in this volume) are far less promising than is generally
made out.

I think it's time we started looking critically at existing
applications and at promised future applications, asking whether
they are really useful and if not, why not. If we are critical, we
can save a good deal of time and money today, by not spending it on
useless programs, and even more tomorrow, by not spending it on
development of impossible or ill-conceived applications.

The problem, of course, is how to look critically at
applications. There are many out there, and each does slightly
different things. If we tried to evaluate individual programs (if
I tried to argue my position by looking at individual examples of
each program) we'd get lost. I propose that we can begin our
criticism by developing criteria for what could, in principle, be
useful applications. We can, in other words, ask the following
question, a question which is logically prior to the first
question.

2. What applications can computers have in the composition
classroom?

The first question is a question about what it would be nice to
have computers do; the second is a question about what it is
possible to have computers do.

Obviously the second should have been the primary question all
along, since if you answer the first without answering the second,
you waste time making up nice applications which won't work. The
second question, however, has not been raised in the literature. I
think I know why: People writing about computers and composition
have assumed that computers can do anything.
This assumption is not true. Computers work in a particular
way, just as automobiles do. The way they work makes them good at
some things and bad at others, just as cars are good at flying
along freeways and bad at flying to the moon. Just as with cars,
if one wants to use computers effectively, one has to be very clear
about what one wants to do with them (what one's purposes are) and
very clear about what can be done with them (what their
capabilities are).

What are those purposes? In general, we (teachers of
composition) want computers to do two slightly different things.
We would like computers to make the mechanics of writing easier, by
facilitating typing and formatting and by making it easier to get
access to texts. The capability that we use when we turn computers
to these purposes is the ability to manipulate text. This
capability is used in word processors, formatters, and
communications programs. We would also like computers to make the
text itself better by analyzing the text and correcting problems in
it. The capability used here is the ability to respond to or
analyze text. This capability is used in text analysis programs
(which I will look at closely in the remainder of this paper),
invention aids, and idea processors. In this paper, I want to
argue that we can use the computer effectively for the first
purpose, but we can't for the second, primarily because computers
are good at manipulating text, but bad at analyzing it. Since
these capabilities won't change, new programs that make
manipulation of text even easier are promising developments;
programs that try to improve the text won't work out at all.

This is a strong claim, I admit, and it's a difficult one to
make for the following reason: Essentially, I am saying that it is
impossible for computers to do certain things. It is, however,
difficult to prove that Q is impossible, because the fact that Q
hasn't been done is not sufficient proof. People who do think Q is
possible will claim that progress is being made toward the goal.
(They will also, in my experience, make nasty asides about how
skeptics laughed at the Wright brothers.) The way to argue for
this claim, therefore, is to look at the progress that has been
made: if progress is steady, and reaching the goal merely a matter
of persistence, then it is reasonable to believe that the goal can
be reached. If, on the other hand, progress is unsteady because
researchers are constantly coming up against problems that aren't
solvable (problems, that is, that there is no reason, in principle,
to think can be solved) and if, in particular, researchers in many
different areas are coming up against the same kind of problem,
then the burden of proof falls on the researchers, not the
skeptics.
Syntax and Semantics

There is a problem with no obvious principled solution, and
this problem is impeding progress. It's a very serious problem,
because it is built in to the way computers work; there's no way
around the problem unless you change computers. The problem is
this. When computers try to analyze text (respond to it is the
term I will use from now on), they don't do it the way we do.

When we look at a text, we respond to it on the basis of its
meaning. (Semantics and content are more technical terms for the
same thing that will be used periodically from now on.) When
computers look at a text, they respond to it on the basis of its
shape.¹ (Syntax and form are the equivalent technical terms.)
This difference exists no matter what kind of text we are talking
about.



¹ Don't worry, what I mean by "shape" will get clearer. I take
the word from Jerry Fodor. See "Methodological Solipsism
Considered as a Research Strategy in Cognitive Science" in John
Haugeland, ed., Mind Design (Cambridge: M.I.T. Press, 1981).

Take, for instance, the lowly letter "A." To me, the letter
is a meaningful object.² When I type the following symbol, "A," I
am typing the letter "A." When the computer responds to my typing
by displaying the letter, it only displays the symbol "A." As far
as the computer is concerned, it is merely commanding the terminal
to display a dot pattern that corresponds to the ASCII code 101
(octal). You can see the difference if you consider what would
happen if we changed the ASCII code so that 101 corresponded to
some Chinese character. When I type the letter "A," the computer
would then cheerfully display a squiggle,³ rather than "A." If the
computer understood "A" as a letter, it would know it had made a
mistake. As it is, the computer doesn't.

² If I were Chinese, on the other hand, it would not be
meaningful; it would be just a bunch of squiggles.
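The point about code 101 can be made concrete with a small sketch. It is only an illustration under stated assumptions: the glyph table, the `display` function, and the replacement character are all hypothetical stand-ins for the terminal, not a description of any real device.

```python
# A toy "terminal": a table from numeric codes to display patterns.
# The table and function names are hypothetical, for illustration only.
GLYPHS = {0o101: "A", 0o102: "B"}


def display(code, glyphs):
    """Return whatever pattern the table assigns to the code."""
    return glyphs[code]


typed = ord("A")  # 65 decimal, which is 101 in octal
standard = display(typed, GLYPHS)

# Remap code 101 to a Chinese character. The lookup proceeds exactly
# as before; nothing in it can notice that the result is no longer "A".
remapped_table = dict(GLYPHS)
remapped_table[0o101] = "字"
remapped = display(typed, remapped_table)
```

The lookup is the same in both cases. Only the table differs, and the program has no way to tell that one answer is right and the other wrong.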

The ASCII code 101 is a label, a syntactic indicator, of the
letter "A." The computer never responds to letters; all it ever
does is respond to syntactic indicators, to shape, not content.
Whenever the computer appears to be manipulating meaningful text,
all it is actually doing is simulating the manipulation by
manipulating syntactic indicators of the text. In the case of
single letters, there is a one-to-one correspondence between
syntactic indicators and meaningful objects, so simulation is quite
easy and very exact. In other cases, the simulation is not quite
so accurate. Responding to the form and responding to the content
occasionally produce different results.
³ Those of you in the know recognize this as an arcane version of
John Searle's Chinese Room argument. See "Minds, Brains, and
Programs," The Behavioral and Brain Sciences 3 (1980), pp. 417-457.
Take, for instance, the simulation done by this word-
processing program. If I type the letter "A," the computer
reliably simulates my action by displaying the group of squiggles,
"A." Consider, however, what happens when I do something slightly
more complex: I ask the computer to delete this sentence
backwards. The sentence, the meaningful object, begins with the
word "Consider." The computer, however, only deletes backward to
the colon before the symbol "I." The computer looks for the
syntactic indicators of sentence beginnings, a period or colon
followed by two spaces, not for sentence beginnings. (Reasonably
enough, in the case of lists.) Whenever such a sequence does not
begin a sentence, or whenever a sentence beginning doesn't have
that sequence (I might mistype), the computer does not simulate
accurately.
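A minimal sketch of the behavior just described, assuming the editor's only indicator is "a period or colon followed by two spaces." The function name and the sample text are hypothetical; the word processor's actual code is not given in this paper.

```python
import re

# Assumed syntactic indicator of a sentence beginning:
# a period or colon followed by two spaces.
BOUNDARY = re.compile(r"[.:]  ")


def sentence_start(text, cursor):
    """Return the index where 'delete sentence backward' would stop."""
    last = 0
    for match in BOUNDARY.finditer(text, 0, cursor):
        last = match.end()
    return last


text = ("It works.  Consider what happens when I do something more "
        "complex:  I ask the computer to delete this sentence backwards.")
start = sentence_start(text, len(text))
kept = text[:start]      # what survives the deletion
deleted = text[start:]   # what the editor treats as "the sentence"
```

Here `deleted` begins at "I ask the computer," not at "Consider": the routine finds the last indicator before the cursor and stops there, whether or not a sentence begins at that point.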

If I were a computer programmer, I could improve the
simulation by finding more accurate syntactic indicators. I could,
for instance, label each noun, verb, pronoun, and adverb in a
putative sentence (making the parts of speech of the words
recognizable by their label and thus making them into syntactic
objects) and have the computer look for word sequences with free
nouns and verbs preceded by a period and any number of spaces.
This would be difficult and the end result would be slow, but the
final product would be a more accurate simulation. Not perfectly
accurate, of course, but pretty good.⁴ Sentences and letters are,
of course, much the best targets for a simulator because we already
have syntactic conventions for labeling them. Objects which are,
so to speak, more "meaningful" (less determined by syntactic
constraints) are much harder to simulate, and simulations
therefore go wrong more often.

This is the problem mentioned above. The more meaningful a text,
the more a correct response to a text requires a response to the
meaning of the text and the more difficult it is to find syntactic
equivalents of the text which will allow a computer to simulate the
correct response. This rule has several consequences, which I
present without proof. First, while any particular text may in
fact receive the correct simulated response, it is not possible to
build a system which can be relied on to respond correctly every
time. Second, the more difficult the simulation, the less reliable
it is. Third, in any application where the simulation is
unreliable, the user has to identify and correct the errors of the
computer. Fourth, when the user has to judge the output before
making use of it, most, if not all of the utility of the program is
lost.

⁴ By perfect, I mean only that it would respond as a reasonably
well-informed person would. People make mistakes, too, but of a
different, more reasonable kind. They are mistakes in responding
to the meaning of a text.
This rule depends on a notion of what "meaningful object" and
"requires a response to the meaning of the text" are, and without
going into a lot of technical philosophy, those notions are
difficult to explain precisely. For my purposes, however, all you
need is an intuitive feeling for what the ideas mean, and that I
can give you with the examples in the next section. There, I'll
look at several different kinds of text analyzers, ranking them
according to how much they need to respond to the meaning of text
and showing how unreliability goes up with the difficulty of the
task.

Text Analyzers: Spelling, Style, and Grammar Checkers

Programs which perform text analysis have been widely touted.
They are, it's said, a way of relieving both writer and teacher of
unnecessary, painstaking work. They provide, moreover, a way of
teaching students to perform this work, since the checkers provide
instant feedback on the students' actual papers.

These claims certainly seem to have some merit in the case of
spelling checkers. Spelling checkers, it appears, find misspelled
words for the teacher and writer, saving both the work of
proofreading. Having a checker available makes it easier for bad
spellers to learn how to spell, since it provides feedback; it can
even help good spellers.
Unfortunately, the appearance is rosier than the reality.
Spelling checkers aren't entirely accurate. The reason for the
inaccuracy is that the spelling checkers don't work the way we do.
When we check spelling, we look for meaningful objects: misspelled
words. Spelling checkers, on the other hand, take each discrete
symbol string in a text and compare it to a list of discrete symbol
strings called a wordlist. If the string is on the wordlist, it is
discarded; if it is not, it is sent to output, where we call it a
misspelling. The difference in the way the two work produces
inaccuracies. Words witch (e.g.) have been accidentally converted
into other words are not caught. Neither are misspelled words that
have managed to creep onto the wordlist. Correctly spelled words,
on the other hand, that are not on the list are caught. Caught,
too, are non-standard but acceptable spellings ("gage" for
"gauge") and non-conventional forms of spelling (s-p-e-l-l). The
upshot: the list of unrecognized words doesn't correspond to the
list of misspelled words.

"So what," you might say, "the list is still useful." Not as
useful as you might think. Consider, for instance, Figure 1, which
contains the list of unrecognized words from the second draft of
the longer version of this paper. To put it mildly, there is a lot
of garbage on this list. Most, fortunately, is obvious garbage.
The formatting instructions (Topmargin, Pageheading, etc.), the
proper names, and the obvious gaps in the word list (natatorium)
fall into that category. Some, however, is less obvious. "Roblem"
and "tudent" seem to be typos; "teatime" and "breadbox" might
really be misspellings. Unfortunately, they're garbage too.
"Roblem" and "tudent" are produced by vagaries in the UNIX
operating system, not by me; "teatime" and "breadbox" are spelled
correctly. There are, in fact, no misspelled words on the list.
Does that mean that there are no misspelled words in my text? By
no means.

Figure 1: Output of Spell Routine Run on an Earlier
Draft of this Paper
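The wordlist comparison described above can be sketched in a few lines. This is an assumption-laden illustration, not the UNIX spell routine itself: the tiny wordlist (which deliberately includes the misspelling "teh") and the function name are hypothetical.

```python
import re

# A hypothetical wordlist. Note that "teh" has crept onto it,
# and that "natatorium" is missing from it.
WORDLIST = {"words", "which", "witch", "have", "been", "teh"}


def unrecognized(text, wordlist):
    """Return every symbol string in the text that is not on the wordlist."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    return [token for token in tokens if token not in wordlist]


flagged = unrecognized("Words witch have been teh natatorium", WORDLIST)
```

The one string flagged is "natatorium," which is spelled correctly. "Witch" (the wrong word) and "teh" (a misspelling that is on the list) both pass, because the routine compares shapes against a list and never looks for misspelled words.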
So what have I gained by using the program? I've wasted some
time running the program (which is, understandably, slow); I've
wasted time hunting down apparent errors; I've learned nothing knew
(oops); and I still have to proofread. This is, in my experience,
typical, and for that reason, I've stopped using the program.
Other program designs are somewhat easier to use (we have an
interactive speller now), but the problem remains the same and so
does the result. I don't use it.

Let me pause for a moment and point out the structure of my
response to the program. Because the program is not accurate, I
need to adjust my reactions to the inaccuracy; the first thing I
must do is judge the correctness of the program's output. Usually,
that's pretty easy to do, but inevitably, there are some points
where my knowledge is insufficient, and I can't tell whether it's
right. I then have to figure out who is right. Because I
understand spelling and I understand how the computer works, the
effort involved is not too great. Still, I must expend this effort
before I can begin to use the program. All in all, I find that
sorting through the garbage is not worth the effort, especially
since I can't be sure that the output includes all the possible
mistakes.

A normal response to this description of my problem is the
following. "Sure, you don't need a spell program. But what about
the people who really can't spell?" Yes, they have more
inclination to use the program, and in the long run, they may use
it more. But they, too, have to sort through the garbage first.
And for them, the sorting is considerably more difficult. For one
thing, the list is longer. For another, they know less about
spelling than I do, so there are more points where they must check
on the program. Remember, they are less able to tell the
difference between words that just aren't on the list and words
that are misspelled. So for them, the spelling checkers are much
more difficult to use.

For the bad speller, it is still worthwhile to expend the
effort, but for all but the dedicated bad speller there are
dangers. Here's one I've seen. The bad speller gets a list of
doubtful words and rather than figure out when the computer is
wrong, he changes every entry to some other word. He excises a
word from his vocabulary rather than trying to figure out whether
it is misspelled.

Why, despite the problems, is it worthwhile to use the spell
routines? There are three reasons:

- They are relatively accurate, so the garbage ratio is
fairly low.

- They can be used to find inadvertent errors. I often
accidetnally (e.g.) misspell words.

- The rules they follow are strict. When a word is
misspelled, it's misspelled; that's all there is to it.

You can see, though, that if any of these conditions is not
met, then it makes much less sense to do the work required to use
the program.

With other text analyzers, however, none of these conditions
is true. The reason they are not true is that other text
analyzers are looking for features of prose that don't have such
simple syntactic indicators, features of prose that are more
heavily determined by the meaning.

To show you what I mean, let me take a look at two text
analysis programs that are widely considered to be the state of the
art: STYLE and DICTION, two of the Writer's Workbench programs put
out by Bell Labs.

STYLE describes syntactic features of your text: number of
words, number of sentences, kinds of sentences, readability, etc.
It does this pretty much the way I described above. The program
has a list of the parts of speech of each word (these are the
syntactic indicators) and a list of sequences of parts of speech.
It runs your text through these lists and ends up with a count of
the sequences. These are then analyzed and displayed. Accuracy is
not great, as you might imagine, but then again, accuracy is not
really the issue with this program.⁵ Consider, for instance, the
output of the STYLE program run on this article up to the beginning
of the previous paragraph (Figure 2).

. \ • ■ . 

⁵ Lorinda Cherry estimates that the type of each sentence is
identified accurately about 86.5% of the time. Her sample,
however, was only 20 technical documents. Authors of programs,
moreover, tend to exaggerate, so we can assume it is somewhat less.
L. L. Cherry and W. Vesterman, "Writing Tools - The STYLE and
DICTION Programs," in The UNIX User's Guide (Murray Hill,
New Jersey: Bell Laboratories, 1980), p. 10.

readability grades:
    (Kincaid) 9.6  (auto) 9.9  (Coleman-Liau) 10.0  (Flesch) 10.1 (59.6)
sentence info:
    no. sent 163  no. wds 3081
    av sent leng 18.9  av word leng 4.65
    no. questions 6  no. imperatives 0
    no. nonfunc wds 1705  55.3%  av leng 6.12
    short sent (<14) 37% (60)  long sent (>29) 11% (18)
    longest sent 84 wds at sent 1; shortest sent 4 wds at sent 24
sentence types:
    simple 38% (62)  complex 39% (63)
    compound 9% (15)  compound-complex 14% (23)
word usage:
    verb types as % of total verbs
    tobe 41% (169)  aux 18% (75)  inf 15% (63)
    passives as % of non-inf verbs 10% (36)
    types as % of total
    prep 9.3% (287)  conj 2.7% (84)  adv 6.5% (200)
    noun 25.6% (790)  adj 13.5% (416)  pron 8.9% (275)
    nominalizations 2% (47)
sentence beginnings:
    subject opener: noun (52) pron (28) pos (0) adj (11) art (20) tot 68%
    prep 6% (10)  adv 10% (17)
    verb 1% (2)  sub_conj 9% (15)  conj 2% (3)
    expletives 3% (5)

Figure 2: Output of STYLE Program Run on this Paper
up to the Previous Paragraph

As you can see, the issue is that the numbers don't tell me
anything useful. Given my purpose and audience, I can't tell
whether a Flesch reading of 10.0 is good or bad. (It varies, by
the way, by more than a grade level from draft to draft.) I can't
tell whether I should increase or decrease the number of
non-functional words. I can't tell whether 68% subject openers is
in line or not. I'm getting syntactic information, all right, but
the information is not useful unless I can relate it to the
content, but relating it to the content is just what this program
can't do. It couldn't relate it to the content unless there were
strict rules connecting syntax to semantics (as there are with
spelling checkers). (Notice, by the way, that one is unlikely to
inadvertently produce mistakes that are identifiable by the
program.)
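The summary figures in Figure 2 are nothing more than ratios of counts, which a short sketch can reproduce. The variable names are my own, and the assumption that each percentage is a simple ratio of the counts shown is an inference from the figure, not something taken from the STYLE documentation.

```python
# Counts as reported in Figure 2.
sentences = 163
words = 3081
nonfunction_words = 1705
short_sentences = 60
long_sentences = 18
subject_openers = 52 + 28 + 0 + 11 + 20  # noun, pron, pos, adj, art

# Assumed derivations: plain ratios of the counts above.
avg_sentence_length = round(words / sentences, 1)
pct_nonfunction = round(100 * nonfunction_words / words, 1)
pct_short = round(100 * short_sentences / sentences)
pct_long = round(100 * long_sentences / sentences)
pct_subject_openers = round(100 * subject_openers / sentences)
```

Every value matches the figure (18.9, 55.3%, 37%, 11%, 68%), and none of the arithmetic refers to what any sentence says. That is the sense in which the output is syntactic information only.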

The makers of this program would agree with me, but they would
argue that I am demanding too much of it. The program is not
designed for such sophisticated users; it's designed for somebody
who might have a Flesch readability of 15.0 (the authors of the
Federalist Papers, says Cherry, p. 9) or somebody whose average
sentence length is 38.4. Part of your objection, they might say,
is the fact that the output looks so impenetrable. But that
problem has been fixed; outputs are better and more explanatory.
People are told that an average sentence length of 38.4 is way out
of line, and are asked to rewrite. Unfortunately, though, my
objection remains. People whose numbers are flagged for good
reasons are not likely to be helped in any fundamental way by this
syntactic fix, by putting in a bunch of periods. Yet they are
precisely the people whose understanding of English is bad enough
that they would take the computer at its word and attempt to fix
the text in the way suggested. Even for them, the syntactic
information must therefore be made meaningful in terms of the
content for it to do any good. They are thus caught in the same
bind mentioned before. They need to be able to interpret the
output before they can use the program.

The DICTION program catches words that ought to be eliminated
or ought to have another word (given by the EXPLAIN program)
substituted for them. It works much as spell routines do, with
different wordlists. Figure 3 shows some of the output of the
DICTION program run on an earlier draft of this article up to the
same point. Let me caution you that I write in a somewhat breezy,
slangy way, and that the Writer's Workbench is not particularly
sympathetic to that kind of style.


Like the SPELL program, DICTION produces output that is
inaccurate ("ludicrous" might be a better word), so inaccurate that
it is useless. Would the program be more useful to others?
Perhaps, but the problem of sorting through the garbage is much
more difficult. To use the program properly, the writer must be
able to distinguish between correct and incorrect flags, perhaps by
substituting in the word suggested by EXPLAIN. Certainly, obvious
garbage, like recommending that "meaningful" be eliminated in the
sample sentence, can be caught this way. But what about the rest?
Almost none of the recommended substitutions have the status of
absolute rules, the way recommended changes in spelling do. The
correct substitutions are merely considered better by most educated
writers. So, in order to distinguish between the correct and the
incorrect, but possible substitutions, the writer must have a sense
of what most educated writers think is correct. The writer must,
in other words, be the sort of person who doesn't really need the
program, except to catch inadvertent errors. If the writer isn't,
each flag poses a formidable problem.
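Since DICTION is said to work much as spell routines do, with different wordlists, its flagging can be sketched as a phrase lookup. The phrase list and function name here are hypothetical, and the real program's list and matching rules are certainly larger and may differ; only the `*[ ... ]*` bracket notation is taken from its output.

```python
import re

# A hypothetical phrase list; the real DICTION list is not reproduced here.
PHRASES = ["in the case of", "on the basis of", "which", "very", "quite"]


def flag(sentence, phrases):
    """Bracket every listed phrase, whatever its role in the sentence."""
    # Longest phrases first, so multi-word entries win over shorter ones.
    ordered = sorted(phrases, key=len, reverse=True)
    alternatives = "|".join(re.escape(phrase) for phrase in ordered)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "*[ " + m.group(1) + " ]*", sentence)


flagged = flag(
    "When we look at a text, we respond to it on the basis of its meaning.",
    PHRASES,
)
harmless = flag("I do not know which is better.", PHRASES)
```

The second call brackets "which" in a sentence where nothing is wrong with it. The lookup matches the string, not its function, so deciding whether a flag is correct is left entirely to the writer.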

We can, in other words, ask the following question, a question
*[ which ]* is logically *[ prior to ]* the first question.

This *[ capability ]* is used in word processors, formatters, and
communications programs.

*[ Essentially, ]* I am saying that it is impossible for
computers to do certain things.

It is, however, difficult to prove that Q is impossible, because
*[ the fact ]* that Q hasn't been done is not *[ sufficient ]*
proof.

When we look at a text, we respond to it *[ on the basis of
]* its meaning.

This difference exists no matter what *[ kind of ]* text we
are talking about.

Whenever the computer appears to be manipulating
*[ meaningful ]* text, all it is *[ actual ]*ly doing is simulating
the manipulation by manipulating syntactic indicators of the text.

*[ In the case of ]* single letters, there is a one-to-one
correspondence between syntactic indicators and *[ meaningful ]*
objects, so simulation is *[ quite ]* easy and *[ very ]* exact.

First, while any particular text may *[ in fact ]* receive the
correct simulated response, it is not possible to build a system
*[ which ]* can be relied on to respond correctly every time.

Fourth, when the user has to judge the output before making use
of it, most, if not *[ all of ]* the utility of the program is
lost.

For another, they know less about spelling than I do, so there
are more points where they must *[ check on ]* the program.

The program is not designed for such *[ sophisticated ]* users;
it's designed for somebody who might have a readability of 13.8
*[ (the authors ]* of the Federalist Papers, says Cherry) or
somebody whose average sentence length is 38.4.

number of sentences 228 number of phrases found 59

Figure 3: Output of the DICTION Program Run on this Paper
up to the Previous Paragraph

In the original documentation for DICTION, Lorinda Cherry says
that in its first release, between 50 and 60% of the recommended
corrections were actually (oops?) made. She thinks that that is a
sign of the program's usefulness. On the contrary. If the
printout above is any indication, far more than 50% of what it
catches is garbage. These scientists at Bell Labs must not know
the rules and are taking the computer's word for it. They are
doing the same thing the bad speller does, abandoning their
natural, correct way of speaking because they don't understand the
program or are too lazy to contradict it.

You can, of course, teach people how to use these programs
correctly. When you do, you have also made the program useless,
since people will then have learned (pretty much) to recognize
these problems in their own prose, and when only inadvertent errors
remain, the garbage ratio is too high. This is not such a bad
result, but it's also not a very important one. It takes time to
teach people how to read this analysis of syntax, time to teach
them the list of easily dispensable words. That time must be taken
from other pedagogical tasks. I certainly never took this time in
the past, and I don't see why the sudden availability of this
program should change that decision. I should add that teaching
the use of this program is not completely straightforward. Lorinda
Cherry's idea of correct usage is not mine.



⁶ Do the uninstructed actually use these programs? In a year and
a half of working in the computer rooms at M.I.T., I have never
seen anyone use the STYLE or DICTION programs.
Indeed, I think both programs are something of a distraction.
They give people the idea that using these programs is an efficient
way of improving one's prose. It's not, particularly. One problem
I have with writing this paper is that I am constantly having to
run the DICTION program, since I want it to show sentences that you
recognize. I am, in other words, constantly revising the sentences
that DICTION is flagging. Am I making the corrections it suggests?
Clearly not. I am changing other things. Am I changing those
sentences more than I'm changing other sentences? I think so.
When I read the sentences that have been flagged, I notice that
things are wrong with them, and I change the sentences. This
suggests that randomly flagging sentences will produce more and
better corrections than running the DICTION program.

DICTION and STYLE are still doing very crude analyses of
texts. Would a more powerful program be better? I have seen one
such program, IBM's EPISTLE. EPISTLE improves on the crude
analyses done by the Writer's Workbench programs by including more
fine-grained tests of syntactic sequence and adding syntactic
indicators of semantic content. Thus, EPISTLE will pick out
witch/which spelling mistakes, because only one is syntactically
possible, and it will pass "do not know which is better," because
it correctly analyzes the function of which in that sentence. It
will catch "My father eat turkey on Thanksgiving" and might even
catch "My father eats graphite on Thanksgiving." Those are notable
accomplishments; grammar is very difficult to analyze, because it's
highly meaning-dependent. Still, even though spelling accuracy
goes up, overall accuracy goes down. In the demonstration I saw,[7]
the makers claimed that the program caught 70% of the errors in
"typical business letters." That means, of course, that the
garbage ratio was 30% and that it missed 30% of the actual (oops!)
errors.
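
A rough sketch may help keep the two figures apart, since they are
computed from different counts: the garbage ratio from what the
program flags, the miss rate from the errors actually present in the
text. The counts below are hypothetical, chosen only to reproduce the
70% and 30% figures quoted above, and the function names are
illustrative; nothing here is taken from EPISTLE itself.

```python
def garbage_ratio(flags_total, flags_correct):
    """Fraction of the program's flags that point at nothing wrong."""
    return (flags_total - flags_correct) / flags_total


def miss_rate(errors_total, errors_caught):
    """Fraction of the real errors that the program never flagged."""
    return (errors_total - errors_caught) / errors_total


# Hypothetical counts for a batch of letters (assumed, not measured):
errors_total = 100   # real errors present in the text
flags_total = 100    # sentences the program flagged
flags_correct = 70   # flags that landed on a real error

print("garbage ratio:", garbage_ratio(flags_total, flags_correct))
print("miss rate:", miss_rate(errors_total, flags_correct))
```

With these assumed counts, both figures come out at 30%. A writer
reading the output has to cope with both at once: sorting out the
flags that are garbage, and finding the errors that drew no flag.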

Is that acceptable? Surely, only marginally. The same
dynamic is at work here. The person who is good at grammar doesn't
need it; the person who is bad has more garbage to pick through and
doesn't have the resources to decide the questionable cases.
EPISTLE does provide on-line explanations (of Grundian strictness),
but if you think about it, even that can't be much help. In the
questionable cases, the on-line explanation is no more helpful, nor
can it be, than the grammar book's or our own, and you know
perfectly well how helpful they are. (Notice, of course, that our
explanations are considerably more trustworthy than the computer's.
With our obiter dicta, the student at least knows that the
judgments are considered, and thus a student has some motivation
for trying to understand us. With this program, the judgments are
not necessarily reasonable, and so the student can have no such
motivation.) The questionable cases, remember, are what matters.
The obvious garbage is not useful. The obvious mistakes, the
mistakes due to inadvertence, are not worth the effort of catching.

I should add that EPISTLE is not really meant for us. In its
current version, the program takes enormous amounts of mainframe
time just to analyze a page. The program is meant to be sold to
companies who can afford it. In such organizations, if my
experience is any guide, the program is likely to be imposed,
rather than used. Writers will be required by harried supervisors
to have a Flesch score below 9 and no more than 2 flags per page.

Can Text Analyzers Be Made Better?

When I presented this argument at the CCCC convention, there
were two reactions. First, people tried to minimize the
seriousness of the inaccuracy. Teachers make mistakes too, one
noted computer researcher told me. This is undeniable. But it
doesn't address the difference between the kinds of mistakes
the two make. People always make reasoned mistakes; their
reactions are based on some reasoned application of a rule to a
text. Computer responses are not reasoned; they apply entirely
different rules to a collection of symbol strings. Most of the
time, the two kinds of applications correlate. Much of the time,
they do not. When we try to evaluate a person's mistake, we are
always responding to their understanding of the rule and the
meaning. When we try to evaluate the computer's, we must first try
to figure out whether there's a correlation error. Not
surprisingly, it takes more knowledge and effort to do the latter,
and the project is not as rewarding.

The second reaction is more important. People want a
technical fix for the problem; they figure that better programs
will solve it. This is a reasonable reaction, but it usually
betrays a lack of understanding of the magnitude of the problem.
Let me try to give you an intuitive sense of how big the problem
is. Any program that gives an accurate simulation of a response to
meaning must, at the very least, be able to parse sentences
correctly. The only way one can do this is to extend the approach
that EPISTLE takes. Consider, then, how EPISTLE might parse the
following sentence.

1. The car hit the man in the street. 

The problem, for the moment, is to figure out the function of "in
the street," so that the computer can choose correctly between the
potential response strings, "Why did it hit the man there?" and
"Why did it hit that man?" There is a constraint. A programmer
can just arrange a correct response to this string by fiat. What
we want, however, is a way which also allows us to parse similar
sentences, like the following:

2. The car hit the man in the side.

3. The car hit the man in the tree.

4. The car hit the man in the park.

5. The car hit the man in the play.

The basic way of solving the problem is to multiply the number
of syntactic indicators. A computer scientist might first classify
all nouns according to whether they define location in space (park,
street), location in the body (side), location in activity (play),
or no location whatsoever. (Most nouns fall into the latter class,
e.g., "the car hit the man in the noun.") Such a classification
would necessarily be incomplete and ambiguous, but it would be a
start. When the computer encounters a preposition like "in,"
which takes a location noun as its object, it would
immediately consult this list. Nouns which were not locations
would generate a query. Nouns which could be locations would then
generate an investigation of words in the rest of the sentence,
which had previously also been classified. The program might then
look for correlations between types of nouns and types of
locations. "Side" would go with man; "street" with car; the rest
with hitting.
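
The procedure just described can be sketched in a few lines. This is
only an illustration under stated assumptions, not EPISTLE's method:
the class names, the tables, and the function attach are all
hypothetical, and a "road" subclass has been added because the
classes named above cannot by themselves pair "street" with the car
while sending "park" to the verb.

```python
import re

# Location classes for the object of "in". "space", "body" and "activity"
# follow the classification in the text; "road" is an assumed subclass.
LOCATION_CLASS = {
    "park": "space",
    "tree": "space",
    "street": "road",
    "side": "body",
    "play": "activity",
}

# Assumed classes for the other nouns in the example sentences.
NOUN_CLASS = {"car": "vehicle", "man": "person"}

# Assumed correlations between a noun class and a location class.
CORRELATIONS = {("person", "body"), ("vehicle", "road")}

PATTERN = re.compile(r"^The (\w+) (\w+) the (\w+) in the (\w+)\.$")


def attach(sentence):
    """Return the word that "in the X" modifies, or "query" if X is not a location."""
    match = PATTERN.match(sentence)
    if match is None:
        raise ValueError("this sketch handles only 'The N V the N in the N.'")
    subject, verb, obj, place = match.groups()
    location = LOCATION_CLASS.get(place)
    if location is None:
        return "query"  # not a location noun: ask the writer
    for noun in (obj, subject):
        if (NOUN_CLASS.get(noun), location) in CORRELATIONS:
            return noun  # a noun in the sentence correlates with the location
    return verb  # otherwise attach the phrase to the event


for place in ("street", "side", "tree", "park", "play", "noun"):
    print(place, "->", attach(f"The car hit the man in the {place}."))
```

Run on the example sentences, the sketch attaches "in the side" to
the man, "in the street" to the car, and the remaining phrases to
the verb, while "in the noun" produces a query. Every one of those
answers comes from a hand-built table entry, not from any grasp of
what the sentence means.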

This wouldn't, of course, produce accurate results, so the
computer scientist would have to build a more complex system, one
that also took into account the verb and indeed took into account
the specific noun-verb conjunction. Such a system, if it were to
be general, would be staggeringly large. Its granularity would
have to be so fine that it could accurately distinguish among "The
car hit," "The boy hit," "The car passed," and "The boy passed,"
since each of those produces a quite different analysis when they
are coupled with the original five prepositional phrases. Even
when the computer scientist has put "car hit" into an "accident"
category, "boy hit" into a "fight" category, and so on, and
attempted to structure the categories so that "man" and location
nouns would fit together with them, he still has problems. It
looks very much as if each noun must be fit to the event
separately. In most cases, one would want to classify "tree,"
"park," and "street" as similar location nouns and treat them
together. But in this case, our common sense about where cars
usually go overrides this classification and forces us to suspect
that "in the tree" identifies where the man had been, not the
location of the event. Similar problems occur even with a word
like "side." "Middle," for instance, might usually be classified
with "side," but that classification would not have enough
granularity to take care of this problem. "The car hit the man in
the middle" means something entirely different. Or consider what
happens if "side" is appropriately modified. In a sentence like
"The car hit the man in the east side of the parking lot," "in the
side" should be parsed in the same way that "in the street" would
be. Notice that, even if we have a system that produces the correct
simulation in all these cases, that does not preclude failure. If
the previous sentence were, "Two men were walking side by side, one
on the sidewalk and one in the street," then the parsing of the
original sentence would be incorrect.
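
The size of such a system can be made concrete with a little
arithmetic. The sketch below first enumerates the cases actually
named above (two subjects, two verbs, five prepositional phrases),
then multiplies out a table for a larger vocabulary. The vocabulary
sizes in the second step are invented for illustration, and the
function table_entries is hypothetical.

```python
from itertools import product

# The combinations named in the text.
subjects = ["car", "boy"]
verbs = ["hit", "passed"]
places = ["street", "side", "tree", "park", "play"]

cases = [
    f"The {s} {v} the man in the {p}."
    for s, v, p in product(subjects, verbs, places)
]
print(len(cases))  # each of these sentences needs its own analysis


def table_entries(n_subjects, n_verbs, n_objects, n_places):
    """Entries needed if every subject-verb-object-place combination
    has to be fitted to the event separately."""
    return n_subjects * n_verbs * n_objects * n_places


# Assumed vocabulary sizes, for illustration only.
print(table_entries(1000, 1000, 1000, 1000))
```

The toy example already yields twenty distinct sentences. With the
assumed thousand-word vocabularies the table has a trillion entries,
and that is before the preceding sentence is allowed to change the
parse.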

The Overall Prospects for Successful Simulation or Successful Use
of Programs That Simulate

It's easy to get lost in the details, so let me sum up. The
problem with computer simulation of response to meaning is that
it's inaccurate. The existence of any inaccuracy of this kind
requires that the user analyze each output for correctness of
simulation before analyzing the user's text. Correct analysis of
the problem cases requires so much knowledge that the user is
unlikely to need the analysis in the first place, especially since
the amount of obviously incorrect analysis (garbage) is usually
high. Those users who don't have this knowledge are arguably
damaged as much as they are helped by such programs. And spending
the time necessary to learn them is arguably not productive.



Though I have not shown this here, the argument applies to
other kinds of programs that in fact respond to meaning: invention
programs, template programs (a dutiful son of invention programs,
for teachers or managers who want to be sure they get it right),
even idea processors.[8] Interestingly enough, and important for my
argument, the details matter. Each kind of simulation of meaning
fails in a different way.

The obvious solution is to improve the accuracy of the
programs. If we don't have to check the simulation each time,
because it's perfect, then we can rely on the analysis and work
with it. In the examples I've given, I've tried to show how very
difficult that is even with one sentence. I've also tried to show
that the methods one uses for getting the right response to one
sentence don't necessarily apply to other sentences, so that any
fully accurate system has to build itself up sentence by sentence.
This really is the heart of the matter. Syntax (form) just doesn't
have much to do with semantics (content). To ask syntax to
simulate semantics is rather like asking a snail to fly.[9]

The consequences are very simple. If people do try to work
with or design applications that require syntax to simulate
semantics, they must make that application work by accepting a
certain amount of inaccuracy and by working with enormous numbers
of problems on a case by case basis. Whenever the number of
possible sentences is large or the amount of work involved in
disentangling the meaning is great, attempts at simulation get
overwhelmed. Any person who designs a simulation needs to say how
he or she is going to get around this problem.

[8] Yes, idea processors do respond to meaning, as I show in my
"What Do Idea Processors Process?"

[9] See my "The Future of Computers in Writing and the Teaching of
Writing," in Straightforward Writing, forthcoming.

The simple way of seeing whether I am right is to read the
other papers in this volume. Many of them are confronting the
problem of simulating response to meaning; each of those
researchers finds that the problem is a serious impediment. Each
researcher has his or her own way of getting around the problem,
but none of them solve the problem, and none escape the
consequences I describe. I wrote this paper, by the way, well
before I had read the other papers.

I did, however, have a clue about what would be in them. So
far, my argument rests on the obvious difficulty of getting
syntax to simulate semantics. "But," you might say, "how do I know
that the difficulty is so great? Maybe computer scientists, in
particular artificial-intelligence specialists, can tell us
something more." My argument, however, is actually a little
stronger than that, because it's partly based on the record of
artificial intelligence research. In AI, you see, precisely the
problem we are concerned with, the problem of response to meaning,
has been central since the beginning. Three of the central
problems (..., ... recognition, and computer vision) require
solutions to the meaning problem. None of the central problems
have been solved; all provisional solutions are merely convenient
ways of coping with limited problems, not principled solutions.
And AI research is thirty years old. Skilled computer scientists
have been looking for thirty years for the solution to the problem
we are just starting to encounter, and they haven't found it.
Claims, therefore, that progress is being made should be looked at
with a jaundiced eye.[10]

My interpretation of AI, an interpretation which follows
Dreyfus, is quite pessimistic. But even the most optimistic
proponents of AI acknowledge that solving the meaning problem,
getting computers to simulate responses to the meaning of texts, is
probably the hardest problem facing AI, the problem that will be
solved last. Patrick Winston, a noted proponent of AI, told me,
when I discussed computer programs to help writing, that it would
be thirty years before a solution is even approached.[11] You can
safely assume this is a minimum, not a maximum figure.

[10] The standard account of progress, or lack thereof, toward the
goal of solving the meaning problem is Hubert L. Dreyfus, What
Computers Can't Do (New York: Harper & Row, rev. ed. 1979).

[11] Personal communication, March 13, 1984.

This simple fact is not a cause for despair or even alarm. It
should actually be encouraging. Knowing that we can't effectively
write programs that respond to text, we can direct our energies
toward more fruitful goals. We can make better on-line editors,
better systems for communication, better word processors. That is
what I think we should do. It is not, however, just me who thinks
so. I went around the AI department at MIT soon after I wrote this
paper, and I asked them what projects we in the Writing Program
should undertake. "What kinds of programs should we give
students?" I asked. Here are answers from two noted computer
scientists, whom I will let have the last word.

Better word processors

-- Joseph Weizenbaum

Better word processors

-- Douglas Hofstadter



